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Abstract 

We investigate methods for variance reduction and the ehmination 
of systematic error in a Fourier accelerated Langevin scheme for gen- 
eral spin models. We present resuhs for the SU{3) x SU{3)/SU{3) 
model in two-dimensions that are consistent with those from multi- 
grid methods. We argue that the timing for the Langevin method 
makes it comparable to multi-grid for a given level of error. 
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1 Introduction 



Langevin simulation methods may be applied to any continuous spin model 
[^. While certain models of the 0{N) type may respond best to cluster 
algorithms Q there is no comparable method for those of SU{N) or CP{N) 
type. These have been tackled by multi-grid methods with good results 
[P, 0, ^ . We show that when the appropriate variance reduction techniques 
and extrapolation methods for the removal of systematic errors have been 
applied, the Langevin scheme can compete on a reasonably equal footing 
with the multi-grid approach at least for the case = 3 . Higher values 
of remain to be investigated. A preliminary version of this work was 
reported at LatticeOl ^ 



2 Langevin Scheme 

For the purposes of exposition we will assume that we are dealing with an 
SU{N) X SU{N)/SU{N) model on an hyper-cubical lattice. We wih 
report results with A^ = 3 and D = 2 for various values of L . The action is 

where £*„ is an element of SU (N) associated with site n and ^ runs over 
forward pointing nearest neighbour sites. The probability distribution we 
wish to simulate is Po{S) oc exp(^(S')) . This distribution is the stationary 
solution of the Fokker-Planck equation 

-^P{S,t)=HP{S,t) , (2) 

where 

H=Y,Kr,mD^^{D'^-U^^) . (3) 
nma 

Here -D," is the covariant derivative appropriate to site n in group direction 
a defined as follows j^. Let {A"} be a basis of generators for the defining 
representation of SU{N) . They satisfy the Lie algebra relation [A'^jA''] = 
Cabc^'^ where the {Cabc} are the structure constants of the group. Let e'^'^S 
stand for the set of spins {e'^"^'' Sn} ■ Then 

D:fiS) = j^fie^-^S)l^, . (4) 
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The covariant derivatives satisfy the Lie algebra relation [D", L>^] = —SnmCabcD'^ 
The drift term in the Fokker-Planck equation is = D!^A(S) . Clearly for 
any choice of the "diffusivity" tensor Knm we have HPq{S) = . 

The presence of Knm allows us to exploit the idea of Fourier acceleration 



[10|. In order to preserve translation invariance K^m depends only on the 
separation n — m oi the two sites. The precise form of the its functional 
depencence is fixed by specifying the shape of its Fourier transform. Guided 
by ideas of free field theory we choose fTT 



k{k) 



{L{k) + M2)max 
L{k)+M'^ 



(5) 



where L{k) is the lattice Laplacian and M is referred to as the acceleration 
parameter and is in practice adjusted to the expected value of the correlation 
length under investigation. 

The hermitian conjugate operator of H which we will use below is 



\ n ' ^n) m 



(6) 



We set the common diagonal value of the diffusivity tensor Knn = K . 
The simulation was performed using a two-stage Fourier accelerated Runge- 
Kutta algorithm presented in ref [|^ in a form appropriate to 0{N) groups. 
For convenience we give the algorithm here in a more general notation. The 
updating step proceeds by first constructing an intermediate position in 
configuration space. 



r,(old) oil) 



Q(old) 



where 



'Im 



The matrices A and B are 
and 
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Bnm — ^nm ~l~ 

The final step to the updated configuration is 

o(old) _^ r,(new) _ J^^'^K'^ n{o\d) 



(7) 
(8) 

(9) 
(10) 

(11) 
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where 

e(2)« = A„^?7««Ar + VA7(i?i) (r,W- + rj(^)- a) , (12) 

\ / nm 

the set of quantities {??m'*"} and {r/m^"} being independent gaussian ran- 
dom numbers with zero mean and unit variance. The superfix (1) on the 
drift term indicates it has been evaluated at the intermediate configuration 

3 Variance Reduction 

The method of modifying an estimator so that its mean is unchanged but 
its variance is reduced has been commonly exploited. In the present case 
we use the structure of the Langevin process itself to create such variance 
reduced estimators. The technique is based on the result that the probabil- 
ity distribution Po{S) we use for averaging is the null eigenfunction of the 
operator H in the Fokker-Planck equation. We have 

HPriS) = ErPr{S) , (13) 

where Ej. r > 1 and Eq = Q . We normalise Pq{S) so that 

{f{S)) = J d^i{S)Po{S)f{S) , (14) 

where dfi{S) is the group invariant measure on the manifold of spins. Be- 
cause HPo{S) = it follows after an appropriate integration by parts that 

{f{S)) = I dfi{S)Po{S)W{H^)f{S) , (15) 

where ^^(z) is any power series in z such that W{0) = 1 . This implies that 
we can use f'{S) = W{H^)f{S) as an estimator for the same quantity as 
f{S). There is then the possibility that f'{S) has a smaller variance than 
f {S) itself. 

The set {ipriS) = Pr{S)/Po{S)} are the eigenfunctions of W 

H^r{S) = ErMS) ■ (16) 

They satisfy the orthogonality relations 

I d^tiS)Pr{S)MS) = Srs ■ (17) 
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Clearly i^o{S) = 1 . If we expand f{S) on the set {ipr{S)} we obtain 

oo 

f{S) = {fiS)) + J2o^rAiS) , (18) 

r=l 

for some set of coeficients {a^} . The new estimator f'{S) is then 

oo 

f'{S) = {f{S)) + Y,C^rWiEr)MS) , (19) 

r=l 

and its variance is 

oo 

a\f) = J2iarWiEr)f . (20) 

r=l 

When W{z) = 1, (T^(/') reduces to (7^(/) . If now we minimise (7^(/') with 
respect to the parameters in W{z) we can expect to find cr^(/') < c^(/) . 
A simple choice which we implement here is VF(z) = 1 + bz . Strategies for 
choosing the value of b are explained in the next section. 

4 Systematic Errors 

The above theory is only applicable to the simulated results on the assump- 
tion that the simulated probability distribution has no systematic errors. In 
fact our Langevin scheme exhibits systematic errors that are O(Ar^) . We 
deal with this problem by a statistically controlled extrapolation proceedure 
combined with variance reduction. 

In outline the procedure for evaluating the mean of an observable f{S) 
is to measure (/)and(/) where / = f together with the correlators 
(/^), (/^)and(//) for a number of runs with different values of values of 
Ar^ . For the smallest of these values Ar^ say, wc obtain the value of b that 
yields the minimum variance for the estimator f = f + bf . We have 

^'(/') = ^'(/) + ^ba\f, /) + b'a\f) , (21) 
and minimum therefore occurs when 

b=-c7\fj)/a\f) , (22) 
where (/,/) = (//)- (/)(/) • 
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Having established the value of b and hence the precise form of /' in this 
way we compute the (/') and cr^(/') for the other larger choices of At . We 
then do a least squares fit on the assumption that 

(/')= (/')lAr=o + /3Ar^ • (23) 
That is we minimise with respect to a and P where 

x' = J:^if'^-^-^^^r?f ■ (24) 

Here i labels the run that uses time step Atj and fl and af are the estimates 
of the mean and variance of /' based on run run. The minimising value 
of a yields our unbiased estimate for (/) and is given by 

^ (e4)(e.^)-(e.^)(e.^) 

(E4)(E,^)-(E.f)' 

The advantage of this proceedure is that it not only provides an estimate 
of the statistical error on the observable of interest but through the resulting 

for the fit provides a test of the assumption of linearity in Ar^ for the 
systematic error. Our evaluation of the statistical error of our unbiased 
estimate of (/) is ^/a'^{a) where 

The detailed treatment of the data involves a systematic re-binning procee- 
dure that we will discuss when we consider decorrelation of measurements 
in the next section. 

The correlation length cannot be measured directly as an observable. 
It is obtained by fitting simultaneously the estimates for a number of ob- 
servables, namely the values of the correlation function Gt = {gt), where 
gt = Tr (s}So^) , St being the spatial average of the spins {Sn} for n in 
time-slice t . Our strategy in this case is a generalization of the single oper- 
ator case discussed above. For a set of values for At we take measurements 
of the observables gt and the observables gt = H^gt forming the correla- 
tion matrices {gtgt')-, {gtgt') a^nd {gtgt') ■ Our method of proceeding is as 
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(25) 



follows. We construct a set of improved operators {g[} where g'^ = gt + btgt ■ 
The coefficients {ht} are to be chosen ultimately to minimise the statistical 
error on our estimate of the correlation length. For simplicity we worked 
in fact with a common value b for all the members of the set {ht} ■ It is 
important to note that the optimal value of b appropriate to the correlation 
length calculation proved to be different from the value appropriate to the 
susceptibility calculation. 

We then can compute the correlation matrix E^^/ = {gtg[,) — {gDid't') for 
a given At . We make the assumption that 

{9't)= {9't)\Ar=o + htAT^ > (27) 

and obtain an unbiased estimate for the Green function Gt = {gt) from the 
extrapolation to At = by minimising where 



J2 ^itl {dit -at- /3Ar2) (3^^. - - ^t'^rf) , (28) 

itt' 



where cjit is the numerical estimate for Gt in the run with At = Arj, and 
S^j^, is the matrix inverse of '^ut' the corresponding numerical estimate for 
Titt' ■ The variance matrix for the resulting {at} is 



a'^{at,at>)= V tt: — ^it"t"'^ — • (29) 
ip;^,, dgu" dgw" 

We use the {at} as the estimates for the {Gt} extrapolated to At = and 
the above variance matrix as the estimate for the extrapolated correlator 
{gtgt') — {gt) {gt') ■ Again we obtain a value for that measures the goodness 
of fit for the extrapolation. 

To these values we fit the shape 

Gt = A + Bcos\i{{t- L/2)/i) , (30) 

determining the values oi A,B and ^ by minimising 

X^ = J2i»t-Gt)^~}{a){at'-Gt') , (31) 
tt' 

where T,tt'{o,) = a'^{at,at') and $]^,^(a) is its matrix inverse. 

The functional form in eq(|30|) is not appropriate for lowest few t-values 
where G{t) is appreciably affected by contributions other than the lowest 
state. We omit these points from the fitting procedure and search for a 
consistent fit at the higher values of t with an acceptable chi^ of about 
unity per degree of freedom. 
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5 Singular Value Decomposition of E 



Because Tiff has some very small eigenvalues it is possible that there can be 
some difficulty in inverting this matrix and hence in obtaining a meaningful 
evaluation of chi^. However, this problem can be circumvented by exploiting 
the singular value decomposition (SVD) of S Q and following the procedure 
suggested by Lepage in ref|^. 

The data for Gf for different values of t are highly positively correlated. 
This means that the estimates for Gt lie on a smoother curve than the in- 
dividual errors on each point would suggest. It also means inevitably that 
there are eigenvectors of S associated with very small eigenvalues and that 
these eigenvectors have entries which change in sign: a behaviour which is 
unphysical and which is not represented in the fitting function eq(|30D. For a 
large lattice there are many such "unphysical" eigenvectors and their asso- 
ciated amplitudes in the expansion of Gt on the eigenbasis each contribute 
to chi^ weighted by the inverse of a small eigenvalue i.e., a large number. 
We then note that the eigenvalues are only statistical estimates for the true 
variances, that the number of unphysical parameters grows with lattice tem- 
poral size and that their contribution to chi^ is strongly weighted compared 
with those of the more physical eigenvectors. 

The naive fitting procedure can then produce unsatisfactory results since 
chi^ is dominated by unphysical modes and a moderate change in the fit- 
ting parameters has very little effect on their contribution to chi^ since the 
smooth fitting function has little overlap with the unphysical modes. The 
consequence is that large variations in the parameters occur in an attempt 
to reduce the bulk contribution to chi^ and the fitted values are thus con- 
trolled not by the physical signal but by statistical fluctuations in amplitudes 
known to be zero . This can result in a fitted function which clearly does 
not pass through the smooth set of data points anywhere near as well as 
it should. To remedy the situation the inverse of T, is computed by first 
making a singular value decomposition and then inverting the eigenvalues 
but replacing the inverse of all but the largest ones by zero. This removes 
the effect of unphysical modes and means that we need not add oscillatory 
fitting functions whose sole role is to soak up the statistical fluctuations on 
amplitudes known to be zero. Note that we include the constant function 
as it has an appreciable expansion on modes of S with large eigenvalues. 

In our fitting procedure we keep about six modes but check the the 
results are stable for all reasonable choices. 
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6 Numerical Results 



Our numerical results were obtained at a couplings g = 1.5 on the 32 x 32 
and g = 1.75 on the 64 x 64 lattices respectively. Measurements were taken 
every two updates after equilibriation at values Ar^ = 0.04, 0.06 & 0.08 
with a squared fourier acceleration mass of = .02 . The total number 
of iterations used was 48 x 10^, with 32 x 10^ at Ar^ = 0.04, and 8 x 10^ 
at both of Ar^ = 0.06 &: 0.08 . We employed a higher number of iterations 
at the lowest value of At'^ because of its importance in the extrapolation 
procedure to Ar^ = . 

The results are accumulated according to a standard re-binning proce- 
dure in which means, mean squares and cross correlators are compiled from 
individual results then from the averages of pairs, quadruples etc. By com- 
puting the variance of an observable from a sequence of bins and noting the 
point at which the it stabilises we can estimate a value for the Langevin cor- 
relation time of the observables in question. The value we quote r^, is that 
appropriate to the susceptibility. By applying this approach with and with- 
out operator improvement we can detect the effect of operator improvement 
on the Langevin decorrelation time. 

In Table 1 we exhibit the results of our procedure for the susceptibility x, 
and the Langevin decorrelation time for lattice sizes of 32 x 32 and 64 x 64. 
In Table 2 we show the results for the correlation length ^ . The values of 
chi^ per degree of freedom arc for the Ar^-extrapolation. The results for x 
and ^ compare quite well with the corresponding results obtained by Meyer 
who used a multi-grid method. We have comparable or smaller statistical 
errors for a smaller number of iterations of the update. 

Fig 1 shows a comparison between extrapolations for the susceptibility 
on the 64 X 64-lattice with and without the operator improvement. The value 
of chi^ for these extrapolations were very much less than unity indicating 
the assumption of O(Ar^) for the systematic errors was well verified. It is 
clear from Fig. 1 that the operator improvement also leads to a reduction 
in the the systematic error, a fact which helps to improve the extrapolation 
procedure by reducing the obliqueness of the angle at which the extrapo- 
lation line hits the Ar^ = axis. The reason may be inferred from the 
following analysis. If we accept that our choice for VF(2;) = 1 + bz is an 
approximation to the choice W{z) = exp{bz} then we see that our oper- 
ator improvement is an approximation to f'{S) = exjp{bH^ f {S)} . The 
simulation results in a probability distribution -Psijn('S') which is in error by 
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O(At^). The simulation average of the improved operator is then 

(/'('5))sim = / d/i(5)Psim('5)e'^V(5) • (32) 
Integration by parts yields 

(/'(5))sim = / MS) (e'^Psim(^)) fi^) ■ (33) 

The effective probability distribution is therefore P^g{S) = ex.p{bH} Pg^^{S) 
Clearly Pqq{S) Po{S) as 6 — > oo . Our variance reduction method 
does yield positive values of b because the operator improvement corre- 
lates strongly and negatively with the original estimator (eq(p2[)). That 
this must be the case follows directly from the non-positive nature of . 
Consequently, it is plausible that we can also expect a reduction in the sys- 
tematic error for the improved operator. We find that this is indeed the 
case. The negative correlation is illustrated in Fig. 2 where we have plotted 
the improved operator results for the various values of At'^ against b. The 
bands represent the the result it the statistical error and not only exhibit 
the expected narrowing corresponding to minimum variance but also show a 
tendency to converge around a region encompassing the true result thus in- 
dicating that the operator improvement compensates in a consistent way for 
the systematic error. This latter compensation is not perfect of course but 
is not required to be and might be further improved by including additional 
(H^)'^ terms in the operator improvement. 

In Figs. 3 & 4 we show the extrapolated correlation function Gt together 
with the fit obtained by the above procedure for 32 x 32 and 64 x 64 lattices. 
In line with our general procedure these fits were obtained with the omission 
of the first three values. The values for the chi^ per degree of freedom were 
approximately unity. The ommission of further t- values changed the fitted 
parameters only by small amounts well within the statistical errors. 

From the results in Table 1 it is clear that operator improvement drasti- 
cally reduces the Langevin correlation time. In the case of the 32 x 32-lattice 
the reduction is by a factor of 3.5 from 14 time-steps down to 4 and for the 
64 X 64-lattice by a factor of 4 from 16 to 4 . These speeded up decorrela- 
tions can be plausibly interpreted as the origin of the reduced variances that 
we achieved after operator improvement. A comparison of our results with 
those of ref |^] suggests that for a given number of iterations our statisti- 
cal errors are better by a factor of a little more than two. Translated into 
numbers of iterations this becomes a factor of approximately four or five. 
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Allowing for differences in computers we estimate that our CPU update time 
is approximately six times greater than that of refs ^ . The net result is 
that we achieve, for = 3 on the sizes of lattice we have so far been able 
to deal with, a given error in a comparable real time notwithstanding our 
need to accumulate data at different values of At in order to perform the 
extrapolation needed to eliminate the systematic error. 

A comparison of methods needs a detailed numerical study. Some gen- 
eral points can be made however. The time consuming parts of the update 
step that we use are the exponentiation of the generators to compute SU (3) 
group elements and the Fourier transform procedure required for the im- 
plementation of the Fourier accelerated update. The exponentiation of the 
generators to produce finite group elements is at the centre of the various 
updating methods. Hasenbusch and Meyer |^, ^ simplify this problem by 
organising the algorithm so that only diagonal generators need to be ex- 
ponentiated. This leads to a dependence on N for their algorithm that is 
0{N) . In our case for SU{N) x SU{N)/SU{N) the dependence is neces- 
sarily 0{N^) . For CP{N), which we have not investigated in this paper, 
it would be 0{N'^) . Superficially then it would seem that our Langevin 
scheme would lose rapidly to the updating technique used by Hasenbusch 
and Meyer for larger N values. However the outcome of any comparison de- 
pends also on the efficacy of the update in covering the configuration space 
of the spins. We have seen that per update our method augmented with 
operator improvement does a relatively better job in producing independent 
configurations. There are grounds to believe that this relative advantage will 
increase with N and that the method will remain competitive with that of 
Hasenbusch and Meyer 0]. Only a numerical trial however can properly 
settle this issue. 

We are presently applying the Langevin method with operator improve- 
ment to larger lattices in order to test for the occurence of critical slowing 
down. A vitally important aspect of the problem that we have not fully 
addressed in this paper. 
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Tables 



Table 1 



9 


L 


K 


X 


chi^ 




1.5 


32 


0.0 


59.0(7) 


0.02 


14 


1.5 


32 


0.6 


57.7(4) 


0.52 


4 


1.75 


64 


0.0 


298.8(4.2) 


0.76 


11 


1.75 


64 


0.8 


301.5(2.8) 


0.003 


3 



Table 2 
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L 


k 




chi^ 


1.5 


32 


0.0 


3.04(3) 


0.94 


1.5 


32 


0.2 


3.05(2) 


1.2 


1.75 


64 


0.0 


8.39(09) 


1.5 


1.75 


64 


0.2 


8.41(07) 


1.5 



Table Captions 

Table 1. List of results for the susceptibility (x) on two lattices with 
and without operator improvment. The chi^ per degree of freedom is 
shown for the Ar^-fit. 

Table 2. List of results for the correlation length (^) on two lattices 
with and without operator improvment. The chi^ per degree of free- 
dom is shown for the Ar^-fit. 

Figure Captions 

Fig. 1. Plot of the susceptibility against Ar^ on a 64 x 64-lattice. 
The upper and lower curves show the results with and without operator 
improvement. 

Fig. 2. Plot of susceptibility on a 64 x 64-lattice against the im- 
provement parameter 5 for Ar^ = 0.04 (continuous line), 0.06 (light 
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dashes) k, 0.08 (heavy dashes). The bands indicate the statistical er- 
ror. 

Fig. 3. The correlation function on a 32 x 32-lattice for coupling 
5 = 1.5 . 

Fig. 4. The correlation function on a 64 x 64-lattice for coupling 
g = 1.75 . 
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